Goto

Collaborating Authors

 probabilistic context-free grammar


Non-autoregressive Machine Translation with Probabilistic Context-free Grammar

Neural Information Processing Systems

Non-autoregressive Transformer(NAT) significantly accelerates the inference of neural machine translation. However, conventional NAT models suffer from limited expression power and performance degradation compared to autoregressive (AT) models due to the assumption of conditional independence among target tokens. To address these limitations, we propose a novel approach called PCFG-NAT, which leverages a specially designed Probabilistic Context-Free Grammar (PCFG) to enhance the ability of NAT models to capture complex dependencies among output tokens. Experimental results on major machine translation benchmarks demonstrate that PCFG-NAT further narrows the gap in translation quality between NAT and AT models. Moreover, PCFG-NAT facilitates a deeper understanding of the generated sentences, addressing the lack of satisfactory explainability in neural machine translation.


Non-autoregressive Machine Translation with Probabilistic Context-free Grammar

Neural Information Processing Systems

Non-autoregressive Transformer(NAT) significantly accelerates the inference of neural machine translation. However, conventional NAT models suffer from limited expression power and performance degradation compared to autoregressive (AT) models due to the assumption of conditional independence among target tokens. To address these limitations, we propose a novel approach called PCFG-NAT, which leverages a specially designed Probabilistic Context-Free Grammar (PCFG) to enhance the ability of NAT models to capture complex dependencies among output tokens. Experimental results on major machine translation benchmarks demonstrate that PCFG-NAT further narrows the gap in translation quality between NAT and AT models. Moreover, PCFG-NAT facilitates a deeper understanding of the generated sentences, addressing the lack of satisfactory explainability in neural machine translation.


Bayesian Decision Trees via Tractable Priors and Probabilistic Context-Free Grammars

arXiv.org Artificial Intelligence

Decision Trees are some of the most popular machine learning models today due to their out-of-the-box performance and interpretability. Often, Decision Trees models are constructed greedily in a top-down fashion via heuristic search criteria, such as Gini impurity or entropy. However, trees constructed in this manner are sensitive to minor fluctuations in training data and are prone to overfitting. In contrast, Bayesian approaches to tree construction formulate the selection process as a posterior inference problem; such approaches are more stable and provide greater theoretical guarantees. However, generating Bayesian Decision Trees usually requires sampling from complex, multimodal posterior distributions. Current Markov Chain Monte Carlo-based approaches for sampling Bayesian Decision Trees are prone to mode collapse and long mixing times, which makes them impractical. In this paper, we propose a new criterion for training Bayesian Decision Trees. Our criterion gives rise to BCART-PCFG, which can efficiently sample decision trees from a posterior distribution across trees given the data and find the maximum a posteriori (MAP) tree. Learning the posterior and training the sampler can be done in time that is polynomial in the dataset size. Once the posterior has been learned, trees can be sampled efficiently (linearly in the number of nodes). At the core of our method is a reduction of sampling the posterior to sampling a derivation from a probabilistic context-free grammar. We find that trees sampled via BCART-PCFG perform comparable to or better than greedily-constructed Decision Trees in classification accuracy on several datasets. Additionally, the trees sampled via BCART-PCFG are significantly smaller -- sometimes by as much as 20x.


P(Expression|Grammar): Probability of deriving an algebraic expression with a probabilistic context-free grammar

arXiv.org Artificial Intelligence

Probabilistic context-free grammars have a long-term record of use as generative models in machine learning and symbolic regression. When used for symbolic regression, they generate algebraic expressions. We define the latter as equivalence classes of strings derived by grammar and address the problem of calculating the probability of deriving a given expression with a given grammar. We show that the problem is undecidable in general. We then present specific grammars for generating linear, polynomial, and rational expressions, where algorithms for calculating the probability of a given expression exist. For those grammars, we design algorithms for calculating the exact probability and efficient approximation with arbitrary precision.


Password Cracking

#artificialintelligence

On hearing the term "password-cracking," many will think this post will be about how to guess someone's password or somewhat similar, but the reality is not always so satisfying. In general, whenever anybody types a password on any device or software, passwords don't get stored in the raw format in the database. Instead, raw passwords are first passed through the hashing algorithm, which converts the raw passwords into some particular sequence of letters, numbers, and special characters which looks entirely random for an ordinary being. Now there are several password database leaks and breaches all over the world. One such dataset is Rockyou Dataset, which contains about 31 million passwords; this is a widely used dataset because this dataset contains passwords in plain text format without any hashing.


Password Cracking

#artificialintelligence

On hearing the term "password-cracking," many will think this post will be about how to guess someone's password or somewhat similar, but the reality is not always so satisfying. In general, whenever anybody types a password on any device or software, passwords don't get stored in the raw format in the database. Instead, raw passwords are first passed through the hashing algorithm, which converts the raw passwords into some particular sequence of letters, numbers, and special characters which looks entirely random for an ordinary being. Now there are several password database leaks and breaches all over the world. One such dataset is Rockyou Dataset, which contains about 31 million passwords; this is a widely used dataset because this dataset contains passwords in plain text format without any hashing.


Probabilistic Grammatical Evolution

arXiv.org Artificial Intelligence

Grammatical Evolution (GE) is one of the most popular Genetic Programming (GP) variants, and it has been used with success in several problem domains. Since the original proposal, many enhancements have been proposed to GE in order to address some of its main issues and improve its performance. In this paper we propose Probabilistic Grammatical Evolution (PGE), which introduces a new genotypic representation and new mapping mechanism for GE. Specifically, we resort to a Probabilistic Context-Free Grammar (PCFG) where its probabilities are adapted during the evolutionary process, taking into account the productions chosen to construct the fittest individual. The genotype is a list of real values, where each value represents the likelihood of selecting a derivation rule. We evaluate the performance of PGE in two regression problems and compare it with GE and Structured Grammatical Evolution (SGE). The results show that PGE has a a better performance than GE, with statistically significant differences, and achieved similar performance when comparing with SGE.


Probabilistic Grammars for Equation Discovery

arXiv.org Machine Learning

Equation discovery, also known as symbolic regression, is a type of automated modeling that discovers scientific laws, expressed in the form of equations, from observed data and expert knowledge. Deterministic grammars, such as context-free grammars, have been used to limit the search spaces in equation discovery by providing hard constraints that specify which equations to consider and which not. In this paper, we propose the use of probabilistic context-free grammars in the context of equation discovery. Such grammars encode soft constraints on the space of equations, specifying a prior probability distribution on the space of possible equations. We show that probabilistic grammars can be used to elegantly and flexibly formulate the parsimony principle, that favors simpler equations, through probabilities attached to the rules in the grammars. We demonstrate that the use of probabilistic, rather than deterministic grammars, in the context of a Monte-Carlo algorithm for grammar-based equation discovery, leads to more efficient equation discovery. Finally, by specifying prior probability distributions over equation spaces, the foundations are laid for Bayesian approaches to equation discovery.


Adaptor Grammars: A Framework for Specifying Compositional Nonparametric Bayesian Models

Neural Information Processing Systems

This paper introduces adaptor grammars, a class of probabilistic models of language that generalize probabilistic context-free grammars (PCFGs). Adaptor grammars augment the probabilistic rules of PCFGs with "adaptors" that can induce dependencies among successive uses. With a particular choice of adaptor, based on the Pitman-Yor process, nonparametric Bayesian models of language using Dirichlet processes and hierarchical Dirichlet processes can be written as simple grammars. We present a general-purpose inference algorithm for adaptor grammars, making it easy to define and use such models, and illustrate how several existing nonparametric Bayesian models can be expressed within this framework.


Adaptor Grammars: A Framework for Specifying Compositional Nonparametric Bayesian Models

Neural Information Processing Systems

This paper introduces adaptor grammars, a class of probabilistic models of language that generalize probabilistic context-free grammars (PCFGs). Adaptor grammars augment the probabilistic rules of PCFGs with "adaptors" that can induce dependencies among successive uses. With a particular choice of adaptor, based on the Pitman-Yor process, nonparametric Bayesian models of language using Dirichlet processes and hierarchical Dirichlet processes can be written as simple grammars. We present a general-purpose inference algorithm for adaptor grammars, making it easy to define and use such models, and illustrate how several existing nonparametric Bayesian models can be expressed within this framework.